Abstract

The Intraclass Correlation Coefficient (ICC) is commonly used to estimate the similarity between quantitative
measures obtained from different sources. Overdispersed data are traditionally transformed so that a linear mixed
model (LMM) based ICC can be estimated. A common transformation is the natural logarithm. The reliability
of environmental sampling of fecal slurry on freestall pens has been estimated for Mycobacterium avium subsp.
paratuberculosis using natural logarithm transformed culture results. Recently, the negative binomial ICC was
defined based on a generalized linear mixed model for negative binomial distributed data. The current study
reports on the negative binomial ICC estimate, which includes fixed effects, using culture results of environmental
samples. Simulations using a wide variety of inputs and negative binomial distribution parameters (r; p) showed
better performance of the new negative binomial ICC compared to the ICC based on LMM, even when the negative
binomial data were logarithm or square root transformed. A second comparison that targeted a wider range of
ICC values showed that the mean of the estimated ICC closely approximated the true ICC.

Keywords: Intraclass correlation coefficient; Generalized linear mixed model; Negative binomial mixed model; 
Variance components 



Introduction 

In the simple case of estimating the correlation between 2 factors with a set of quantitative
observations, an investigator may elect to use the Spearman rank correlation coefficient or
Pearson's correlation coefficient, assuming the observations are independent. The measure of
agreement kappa (κ) can be estimated for binary observations. For more complex data structures
that may include either crossed or nested factors of a latent character, the investigator may
use the Intraclass Correlation Coefficient (ICC). The ICC is related to unexplained variance at
the subject level. More specifically, the ICC is defined as the ratio of the covariance of
measurements from the factor of interest to the marginal variance of the observations. The ICC
ranges between 0 and 1, and a value close to 1 indicates that the differences in observations
due to the factor of interest are negligible. Hence, using variance estimates attributable to
each of a study's factors, the ICC can be used as a measure of similarity in observations
between subjects due to a particular factor. A direct application of the ICC is as a measure of
the correlation between subjects in a reliability and repeatability gauge study (Aly et al.
2009; Kittawornrat et al. 2012).

Investigators analyze and obtain variance estimates for normally distributed data using linear
mixed models (LMM) and for non-normally distributed data using generalized linear mixed models
(GLMM). Health science researchers commonly work with count data. While the ICC for the LMM has
been extended to the Poisson case (Carrasco and Jover 2005), its equivalent for count data with
overdispersion was only recently described (Carrasco 2010). Until the ICC for negative binomial
distributed data was developed, researchers applied various transformations to make such data
approximately normally distributed so that the LMM and its ICC could be used.






© 2014 Aly et al.; licensee Springer. This is an Open Access article distributed under the terms of the Creative Commons
Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction
in any medium, provided the original work is properly cited.

Aly et al. SpringerPlus 2014, 3:40
http://www.springerplus.com/content/3/1/40






An example of count data that may commonly be overdispersed is bacterial culture results, which
are commonly reported as colony forming units per specimen mass or culture medium tube. Another
example is parasite counts, which are commonly reported as parasitic stage count per gram of
specimen. Such infectious agents can exist in very large numbers within their hosts, yet not
all potential hosts in a population are infected. In fact, most hosts tend to be uninfected, so
the variance of the data exceeds the mean, hence overdispersion. In the current study, we
report on a reliability analysis for environmental sampling to quantify Mycobacterium avium
subspecies paratuberculosis (MAP) on California free-stall dairies (Aly et al. 2009). A
previous study with these data was unique in that it involved nested and crossed factors and
used the natural logarithm to attain normally distributed data for an LMM analysis and ICC
estimation. Such transformations may normalize the data provided the number of replicates is
large and the variance components are small (Solomon and Taylor 1999). Both conditions may be
difficult to attain with negative binomial distributed data, especially when replicates are
limited due to cost or subject use limitations, as in health sciences research. The performance
of the negative binomial ICC has not been compared to the LMM ICC using previously described
data transformations in multilevel models with crossed and nested random effects.

Hence, the objectives of this study were to specify a negative binomial mixed model, estimate
the resulting ICC, and contrast its performance to that of the ICC based on estimates from
linear mixed models of several data transformations. In addition to the reliability study on
environmental sampling to quantify MAP in dairy pens, a wide variety of negative binomial
distributed data were simulated to contrast estimator performance.

Methods 

ICC for the negative binomial mixed model 

We derive the ICC that estimates the similarity of samples collected by two veterinarians
on the same day from the same pen. Let $y_{ijkl}$ denote the observed value for the $j$th pen
in the $i$th dairy on the $k$th day by the $l$th veterinarian ($i = 1, 2, \ldots, m$;
$j = 1, 2, \ldots, n_i$; $k = 1, 2, \ldots, s$; $l = 1, 2, \ldots, t$); the total number of
observations is $N = st\sum_{i=1}^{m} n_i$. We assume that the conditional distribution of
$y_{ijkl}$ given the random effects $a$, $b$, $c$, $d$ (dairy, pen, day and veterinarian,
respectively) is negative binomial with the number of successes needed $r$ and probability of
success $p_{ijkl}$, or $NB(r; p_{ijkl})$. In this distribution, $r$ is fixed for all $y_{ijkl}$.

Furthermore, it is assumed that

$$\mu_{ijkl} = e^{\beta + a_i + b_{ij} + c_k + d_l},$$

where $\mu_{ijkl}$ is the conditional expectation of $y_{ijkl}$ given $p_{ijkl}$, and where
$a_i$ ($i = 1, 2, \ldots, m$) are independent and distributed as $N(0; \sigma_a^2)$,
$b_{ij}$ ($j = 1, 2, \ldots, n_i$) are independent and distributed as $N(0; \sigma_b^2)$,
$c_k$ ($k = 1, 2, \ldots, s$) are independent and distributed as $N(0; \sigma_c^2)$, and
$d_l$ ($l = 1, 2, \ldots, t$) are independent and distributed as $N(0; \sigma_d^2)$. Recall
that the mean of the negative binomial distribution is $r(1-p)/p$ (Casella and Berger 2002).
Hence the conditional expectation is

$$\mu_{ijkl} = \frac{r\,(1 - p_{ijkl})}{p_{ijkl}}$$

and the conditional probability is

$$p_{ijkl} = \frac{r}{\mu_{ijkl} + r} = \frac{r}{e^{\beta + a_i + b_{ij} + c_k + d_l} + r},$$

so the conditional distribution of $y_{ijkl}$ is
$NB\!\left(r,\; r / (e^{\beta + a_i + b_{ij} + c_k + d_l} + r)\right)$, which is a special case
of the GLMM.

The ICC for the similarity in Herrold's egg yolk medium (HEYM) culture results for MAP in
samples collected by 2 different collectors ($l_1$ and $l_2$) will be derived as an example.
Given the model assumptions and study design, $y_{ijkl_1}$ and $y_{ijkl_2}$ are conditionally
independent if $l_1 \neq l_2$, so the conditional expectation of their product is the product
of their conditional expectations. Therefore,

$$
\begin{aligned}
E(y_{ijkl_1} y_{ijkl_2}) &= E\left[E(y_{ijkl_1} y_{ijkl_2} \mid a, b, c, d)\right] \\
&= E\left[E(y_{ijkl_1} \mid a, b, c, d)\, E(y_{ijkl_2} \mid a, b, c, d)\right] \\
&= E(\mu_{ijkl_1} \mu_{ijkl_2}) \\
&= E\left(e^{\beta + a_i + b_{ij} + c_k + d_{l_1}}\, e^{\beta + a_i + b_{ij} + c_k + d_{l_2}}\right) \\
&= E\left(e^{2(\beta + a_i + b_{ij} + c_k) + d_{l_1} + d_{l_2}}\right)
\end{aligned}
$$

The random variable $2(\beta + a_i + b_{ij} + c_k) + d_{l_1} + d_{l_2}$ has the distribution
$N(2\beta,\; 4\sigma_a^2 + 4\sigma_b^2 + 4\sigma_c^2 + 2\sigma_d^2)$, hence
$e^{2(\beta + a_i + b_{ij} + c_k) + d_{l_1} + d_{l_2}}$ has the log-normal distribution with
parameters $(2\beta,\; 4\sigma_a^2 + 4\sigma_b^2 + 4\sigma_c^2 + 2\sigma_d^2)$.

According to the expectation of the log-normal distribution, we have:

$$E(y_{ijkl_1} y_{ijkl_2}) = e^{2\beta + 2\sigma_a^2 + 2\sigma_b^2 + 2\sigma_c^2 + \sigma_d^2}$$

Similarly,

$$E(y_{ijkl}) = e^{\beta + (\sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \sigma_d^2)/2}$$

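The covariance between two measurements that are generated by different veterinarians but are
otherwise identical in all factors is the difference between the expectation of their product
and the product of their expectations, that is:

$$
\begin{aligned}
Cov(y_{ijkl_1}, y_{ijkl_2}) &= E(y_{ijkl_1} y_{ijkl_2}) - E(y_{ijkl_1})\, E(y_{ijkl_2}) \\
&= e^{2\beta + 2\sigma_a^2 + 2\sigma_b^2 + 2\sigma_c^2 + \sigma_d^2} - e^{2\beta + \sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \sigma_d^2}
\end{aligned}
$$

On the other hand, according to the variance of the log-normal distribution,

$$
\begin{aligned}
Var\left[E(y_{ijkl} \mid \beta, a_i, b_{ij}, c_k, d_l)\right] &= Var\left(e^{\beta + a_i + b_{ij} + c_k + d_l}\right) \\
&= e^{2\beta + 2\sigma_a^2 + 2\sigma_b^2 + 2\sigma_c^2 + 2\sigma_d^2} - e^{2\beta + \sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \sigma_d^2}
\end{aligned}
$$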

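Hence, the expectation of the conditional variance of the observations can be expressed as:

$$
\begin{aligned}
E\left[Var(y_{ijkl} \mid \beta, a_i, b_{ij}, c_k, d_l)\right]
&= E\left[\frac{E^2(y_{ijkl} \mid \beta, a_i, b_{ij}, c_k, d_l)}{r} + E(y_{ijkl} \mid \beta, a_i, b_{ij}, c_k, d_l)\right] \\
&= E\left[\frac{e^{2\beta + 2a_i + 2b_{ij} + 2c_k + 2d_l}}{r} + e^{\beta + a_i + b_{ij} + c_k + d_l}\right] \\
&= \frac{e^{2\beta + 2\sigma_a^2 + 2\sigma_b^2 + 2\sigma_c^2 + 2\sigma_d^2}}{r} + e^{\beta + (\sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \sigma_d^2)/2}
\end{aligned}
$$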



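Therefore, the variance of the observations is:

$$
\begin{aligned}
Var(y_{ijkl}) &= Var\left[E(y_{ijkl} \mid \beta, a_i, b_{ij}, c_k, d_l)\right] + E\left[Var(y_{ijkl} \mid \beta, a_i, b_{ij}, c_k, d_l)\right] \\
&= e^{2\beta + 2\sigma_a^2 + 2\sigma_b^2 + 2\sigma_c^2 + 2\sigma_d^2} - e^{2\beta + \sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \sigma_d^2} \\
&\quad + \frac{e^{2\beta + 2\sigma_a^2 + 2\sigma_b^2 + 2\sigma_c^2 + 2\sigma_d^2}}{r} + e^{\beta + (\sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \sigma_d^2)/2}
\end{aligned}
$$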



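It follows then that the ICC for the negative binomial mixed model for the similarity between
samples collected by two different veterinarians on the same day and from the same pen is:

$$
\rho = \frac{Cov(y_{ijkl_1}, y_{ijkl_2})}{Var(y_{ijkl})}
= \frac{e^{\sigma_a^2 + \sigma_b^2 + \sigma_c^2} - 1}
{e^{\sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \sigma_d^2} - 1 + \dfrac{e^{\sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \sigma_d^2}}{r} + e^{-\beta - (\sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \sigma_d^2)/2}}
\qquad (1)
$$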



When the variance of the data is much larger than its expectation, the negative binomial
distribution is often used as an alternative to the Poisson distribution. The random effects
follow the normal distribution and the link function is the logarithm. Based on this formula,
the ICC no longer depends on the random effects alone, but also on the fixed effect intercept
and the number of successes. Thus, the negative binomial mixed model may be more reasonable
than the LMM or the Poisson GLMM when count data are overdispersed.
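
As an illustration, the closed form obtained by dividing the covariance by the variance above can be evaluated directly. The sketch below is not the authors' code; the function names `nb_icc` and `nb_mean` and their argument names are hypothetical. It assumes the ICC simplifies to (e^(σa²+σb²+σc²) − 1) / (e^(σ²) − 1 + e^(σ²)/r + e^(−β − σ²/2)), with σ² the sum of all four random-effect variances, and that the marginal mean is e^(β + σ²/2).

```python
import math


def nb_icc(beta, r, var_dairy, var_pen, var_day, var_vet):
    """ICC between two veterinarians sampling the same pen on the same day.

    Assumes the negative binomial mixed model with log link described in the
    text; all argument names are illustrative.
    """
    shared = var_dairy + var_pen + var_day   # variance common to both samples
    total = shared + var_vet                 # sum of all random-effect variances
    numerator = math.exp(shared) - 1.0
    denominator = (math.exp(total) - 1.0
                   + math.exp(total) / r
                   + math.exp(-beta - total / 2.0))
    return numerator / denominator


def nb_mean(beta, var_dairy, var_pen, var_day, var_vet):
    """Marginal expectation E(Y) under the same model (log-normal mean)."""
    total = var_dairy + var_pen + var_day + var_vet
    return math.exp(beta + total / 2.0)


# Inputs of scenario 1 in Table 1: r = 1, beta = 0, variances 0.5, 0.5, 0.2, 0.1
print(round(nb_icc(0.0, 1.0, 0.5, 0.5, 0.2, 0.1), 4))  # 0.3382
print(round(nb_mean(0.0, 0.5, 0.5, 0.2, 0.1), 2))      # 1.92
```

Both printed values agree with the E(Y) and true ICC listed for scenario 1 in Table 1, which suggests the simplified form is consistent with the values the authors tabulated.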



Simulations 

Simulations were used to compare the performance of the new negative binomial ICC to that
estimated after count data are transformed toward normality using transformations such as the
logarithm, square root, square, or their inverse values (Carrasco and Jover 2005). To compare
the ICC estimator derived for the negative binomial GLMM to the ICC from traditional methods
such as LMM of normalized count data, 16 scenarios were generated. Fixed values of the input
parameters were used in each of the 16 scenarios; these values and the respective true ICC are
summarized in Table 1. The scenarios included 2 different values of the number of successes
(r = 1 and r = 2), zero and non-zero intercepts (β = 0 and β = 2), 2 different between-dairy
variances (0.5 and 1), and 2 different between-veterinarian variances (0.1 and 0.5). The
between-pen and between-day variances were held fixed because, by equation (1), these variances
influence the ICC in the same way as the between-dairy variance; it is therefore reasonable to
vary only one of them.

Table 1 Parameters of a simulation to compare the true and estimated negative binomial
Intraclass Correlation Coefficient (ICC) using an example of culture results for a specific
bacterium in pen floor samples (variance 0.5) collected several days apart and simultaneously
by different veterinarians and across different dairies

                          Variance
Scenario   r   β    Dairy   Pen   Day   Veterinarian   E(Y)    True ICC
1          1   0    0.5     0.5   0.2   0.1            1.92    0.3382
2          1   0    1       0.5   0.2   0.1            2.46    0.3888
3          1   2    0.5     0.5   0.2   0.1            14.15   0.362
4          1   2    1       0.5   0.2   0.1            18.17   0.4011
5          2   0    0.5     0.5   0.2   0.1            1.92    0.4616
6          2   0    1       0.5   0.2   0.1            2.46    0.5275
7          2   2    0.5     0.5   0.2   0.1            14.15   0.5072
8          2   2    1       0.5   0.2   0.1            18.17   0.5503
9          1   0    0.5     0.5   0.2   0.5            2.34    0.2236
10         1   0    1       0.5   0.2   0.5            3       0.2574
11         1   2    0.5     0.5   0.2   0.5            17.29   0.2319
12         1   2    1       0.5   0.2   0.5            22.2    0.2617
13         2   0    0.5     0.5   0.2   0.5            2.34    0.3037
14         2   0    1       0.5   0.2   0.5            3       0.3476
15         2   2    0.5     0.5   0.2   0.5            17.29   0.3192
16         2   2    1       0.5   0.2   0.5            22.2    0.3556

Based on the study by Aly et al. (2009), there were 4 factors: dairy, pen, veterinarian and
day. Pens were nested within dairy. Dairy i (i = 1, 2, 3, 4) had pens indexed by j, where
j = 1,..., 8 for i = 1; j = 1,..., 11 for i = 2; j = 1,..., 7 for i = 3; and j = 1,..., 4 for
i = 4. Pens were cross-classified by veterinarian l (l = 1, 2) and day k (k = 1, 2, 3). Data
were generated under the assumption of a negative binomial GLMM with log link, with all four
factors a, b, c, d included as random effects. Each sample dataset consisted of 180
observations. Each simulation followed this procedure:

1. Randomly generate normal random effects $a_i$, $b_{ij}$, $c_k$, $d_l$ ($i = 1, 2, \ldots, m$;
$j = 1, 2, \ldots, n_i$; $k = 1, 2, \ldots, s$; $l = 1, 2, \ldots, t$) with the respective
scenario's variances

2. Sum the intercept and random effects to obtain the conditional expectation on the log scale,
$\log \mu_{ijkl} = \beta + a_i + b_{ij} + c_k + d_l$, where $\beta$ is the estimated
intercept from field data

3. Randomly generate the negative binomial variable $Y_{ijkl} \sim NB(r, \mu_{ijkl})$, where
$r$ is the number of successes

4. Estimate the model parameters: intercept $\beta$, number of successes $r$ and random effect
variances $\sigma_a^2$, $\sigma_b^2$, $\sigma_c^2$, $\sigma_d^2$

5. Calculate the ICC
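
Steps 1 to 3 of the procedure above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' R code; the function name `simulate_counts` and its arguments are hypothetical. It assumes the design described in the text (4 dairies with 8, 11, 7 and 4 pens, 3 days, 2 veterinarians) and uses the fact that NumPy's `negative_binomial(n, p)` has mean n(1 − p)/p, so p = r/(r + μ) gives conditional mean μ. Steps 4 and 5 (fitting the GLMM and computing the ICC) are not covered.

```python
import numpy as np


def simulate_counts(beta, r, var_dairy, var_pen, var_day, var_vet,
                    pens_per_dairy=(8, 11, 7, 4), n_days=3, n_vets=2, rng=None):
    """Simulate one dataset of negative binomial counts with crossed/nested effects.

    Returns a list of (dairy, pen, day, vet, count) tuples.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Step 1: normal random effects for dairy (a), day (c) and veterinarian (d)
    a = rng.normal(0.0, np.sqrt(var_dairy), size=len(pens_per_dairy))
    c = rng.normal(0.0, np.sqrt(var_day), size=n_days)
    d = rng.normal(0.0, np.sqrt(var_vet), size=n_vets)
    rows = []
    for i, n_pens in enumerate(pens_per_dairy):
        # Pen effects (b) are nested within dairy i
        b = rng.normal(0.0, np.sqrt(var_pen), size=n_pens)
        for j in range(n_pens):
            for k in range(n_days):
                for l in range(n_vets):
                    # Step 2: conditional expectation under the log link
                    mu = np.exp(beta + a[i] + b[j] + c[k] + d[l])
                    # Step 3: negative binomial draw with success probability p
                    p = r / (r + mu)
                    rows.append((i, j, k, l, int(rng.negative_binomial(r, p))))
    return rows


data = simulate_counts(0.0, 1.0, 0.5, 0.5, 0.2, 0.1, rng=np.random.default_rng(1))
print(len(data))  # 180 observations: 30 pens x 3 days x 2 veterinarians
```

Passing a seeded generator, as in the usage line, keeps a simulated dataset reproducible across runs.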



One hundred simulated data sets were generated under each scenario. For each simulated data
set, the ICC was estimated using four different methods: 1) the negative binomial GLMM; 2) LMM
of raw (untransformed) data; 3) LMM of square root transformed data; and 4) LMM of logarithm
transformed data, where taking the logarithm of zero was avoided by replacing zeros with 0.5.
For LMM, restricted maximum-likelihood (REML) estimation was used, while maximum-likelihood
(ML) estimation was used for the GLMM. Relative bias, variance of the ICC, and mean square
error (MSE) of the ICC estimate were calculated to evaluate the performance of the ICC. The
relative bias was calculated as the difference between the mean of the estimated ICC and its
true value (reported as a percentage of the true value), the variance was calculated as the
unbiased sample variance of the simulated estimates, and the MSE was calculated as the sum of
the squared bias and the variance.
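
The three performance measures can be computed from a set of simulated ICC estimates as in the sketch below. The function name `icc_performance` is hypothetical, and the sketch assumes that relative bias is expressed as a percentage of the true ICC while the MSE uses the absolute (not relative) bias, which is consistent with the magnitudes reported in Table 2.

```python
from statistics import mean, variance


def icc_performance(estimates, true_icc):
    """Relative bias (%), unbiased variance and MSE of simulated ICC estimates."""
    bias = mean(estimates) - true_icc        # absolute bias of the mean estimate
    var = variance(estimates)                # sample variance with n - 1 denominator
    return {
        "relative_bias_pct": 100.0 * bias / true_icc,
        "variance": var,
        "mse": bias ** 2 + var,              # squared bias plus variance
    }


# Toy example with two made-up estimates and a made-up true value
print(icc_performance([0.2, 0.4], 0.25))
```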

A second simulation explored the performance of the ICC estimate over a wider range. The mean
estimated ICC was computed using 400 simulations per combination of number of successes (r = 5
and r = 30) and variance estimates for dairy and veterinarian (0 to 1 in increments of 0.2).



Field data analysis 

Finally, field data used in the report by Aly et al. (2009) were analyzed using the negative
binomial GLMM. Briefly, environmental samples were collected every other day on 3 different
occasions from 4 California dairies between November 2006 and June 2007. Samples were cultured
using bacterium-specific medium and standard microbiological procedures as reported by Aly et
al. (2009). Confidence intervals for model parameters were obtained based on parameter
estimates from the field data and using parametric bootstrap similar to that described in
Table 1 (Efron and Tibshirani 1993). The resulting negative binomial based ICC was contrasted
to that estimated from transformed data and reported previously by Aly et al. (2009). The R
package lme4 was used for LMM analysis, and the package glmmADMB for GLMM analysis. All
packages were loaded in the R 2.15.1 environment.

Results 

Results of the first simulation, which targeted a range of ICC values based on 16 combinations
of input parameters (r, β, variances of dairy, pen, veterinarian and day), are presented in
Table 2. The relative bias, variance and MSE of the ICC estimates based on the negative
binomial model were compared to those based on the LMM of raw (untransformed) and transformed
data. The negative binomial model ICC had the least absolute relative bias in 5 of the 16
scenarios (3, 4, 5, 6 and 8), which were characterized by small variance estimates for
veterinarian (0.1). In comparison, the ICC based on LMM of raw data had the largest number of
scenarios with the least absolute relative bias (9, 10, 11, 12, 14, 15 and 16), characterized
by large variance estimates for veterinarian (0.5). In terms of variance, the negative
binomial ICC had the largest number of scenarios with the least variance (1 to 5, 7, 8, 11, 12,
15). Similarly for MSE, the ICC based on the negative binomial model had the least MSE in 11
of the 16 scenarios (1 to 8, 11, 12, 15 and 16).

The second simulation was performed to investigate the effect of a larger number of successes
(r = 5 and r = 30) and a wider range of variance estimates for dairy and veterinarian that
also included zero. Figure 1 showed that the mean of the estimated ICC and the true ICC were
similar as estimates of variance due to veterinarian ranged from 0.1 to 0.3, even as variance
due to dairy increased to 1. However, as depicted in the diverging planes, the difference
between the estimated and true ICCs increased towards extreme variance estimates. Both
behaviors were consistent with a higher number of successes (r = 30). Figure 1 depicts the
differences between the true ICC and the mean of the respective estimated ICC based on
simulations.

Table 2 Point estimate (PE) relative bias, variance, and mean square error (MSE) of Intraclass Correlation Coefficient (ICC) for
culture results of samples collected by 2 veterinarians and based on the negative binomial mixed model, linear mixed model
with raw data, square-root transformed data and log-transformed data

                                                      Transformed data
Scenario   Parameter           Negative binomial   Raw      Natural logarithm   Square root
1          PE relative bias%   -10.35              -16.14   -5.41               -5.32
           Variance            0.0098              0.0138   0.0145              0.0137
           MSE                 0.011               0.0168   0.0148              0.014
2          PE relative bias%   -10.8               -17.21   -1.13               -2.55
           Variance            0.0108              0.0136   0.0198              0.0183
           MSE                 0.0126              0.0181   0.0198              0.0184
3          PE relative bias%   -5.33               -7.65    8.45                12.43
           Variance            0.0052              0.0118   0.0115              0.012
           MSE                 0.0056              0.0126   0.0124              0.014
4          PE relative bias%   -9.3                -16.93   9.75                9.7
           Variance            0.0067              0.0107   0.0205              0.0152
           MSE                 0.0081              0.0153   0.022               0.0167
5          PE relative bias%   -8.28               -18.37   -10.46              -10.92
           Variance            0.0114              0.0135   0.0133              0.0136
           MSE                 0.0129              0.0207   0.0156              0.0161
6          PE relative bias%   -19.51              -30.33   -21.06              -22.33
           Variance            0.0148              0.0138   0.0162              0.0161
           MSE                 0.0254              0.0394   0.0285              0.03
7          PE relative bias%   -8.02               -14.27   0.41                2.54
           Variance            0.0095              0.012    0.0158              0.0122
           MSE                 0.0112              0.0172   0.0158              0.0124
8          PE relative bias%   -5.89               -15.66   9.03                7.11
           Variance            0.009               0.0107   0.016               0.0131
           MSE                 0.01                0.0181   0.0185              0.0146
9          PE relative bias%   17.53               3.26     27.01               26.74
           Variance            0.0129              0.0105   0.0126              0.0118
           MSE                 0.0144              0.0106   0.0162              0.0154
10         PE relative bias%   8.55                -0.51    31.12               27.35
           Variance            0.0165              0.0145   0.025               0.0225
           MSE                 0.017               0.0145   0.0314              0.0275
11         PE relative bias%   30.36               27.17    62.05               66.58
           Variance            0.0104              0.0134   0.0144              0.015
           MSE                 0.0154              0.0174   0.0351              0.0388
12         PE relative bias%   19.56               16.28    57.01               55.29
           Variance            0.0118              0.0157   0.0193              0.0183
           MSE                 0.0144              0.0175   0.0416              0.0392
13         PE relative bias%   13.24               7.84     18.41               18.51
           Variance            0.0213              0.0185   0.0176              0.0183
           MSE                 0.0229              0.0191   0.0207              0.0215
14         PE relative bias%   7.22                -2.79    17.15               15.39
           Variance            0.0217              0.0172   0.027               0.0255
           MSE                 0.0223              0.0173   0.0306              0.0284
15         PE relative bias%   28.41               23.59    45.99               48.25
           Variance            0.0154              0.0182   0.0178              0.0188
           MSE                 0.0236              0.0239   0.0394              0.0425
16         PE relative bias%   22.69               19.99    51.97               50
           Variance            0.0216              0.0205   0.0262              0.023
           MSE                 0.0281              0.0256   0.0604              0.0546

Results of the negative binomial GLMM are summarized in Table 3. The negative binomial ICC was
estimated to be 0.5207 (95% CI = 0.4033, 0.6091) compared to the estimate based on natural log
transformed data, which was 0.6730 (95% CI = 0.5130, 0.8340).
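
As a worked check, the point estimates reported in Table 3 can be substituted into the closed-form negative binomial ICC. The calculation below is an illustration rather than part of the original analysis; it assumes that the four random-effect entries in Table 3 are variances (0.2691, 1.352, 2.11E-09 and 4.72E-04 for dairy, pen, day and collector), an assumption that reproduces the reported ICC.

```latex
\[
\rho = \frac{e^{\sigma_a^2+\sigma_b^2+\sigma_c^2} - 1}
{e^{\sigma^2} - 1 + e^{\sigma^2}/r + e^{-\beta - \sigma^2/2}},
\qquad \sigma^2 = \sigma_a^2+\sigma_b^2+\sigma_c^2+\sigma_d^2
\]
\[
\sigma_a^2+\sigma_b^2+\sigma_c^2 \approx 1.6211, \qquad \sigma^2 \approx 1.6216
\]
\[
\rho \approx \frac{e^{1.6211} - 1}{e^{1.6216} - 1 + e^{1.6216}/1.379 + e^{-1.9516 - 0.8108}}
\approx \frac{4.0587}{4.0610 + 3.6701 + 0.0631}
\approx 0.5207
\]
```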



Table 3 Parameter estimates from a negative binomial generalized linear mixed model for culture
results from a study on the reliability of an environmental sampling protocol and the
Intraclass Correlation Coefficient (ICC) for similarity in samples collected by two
veterinarians on the same day and from the same pen

                         95% Confidence interval
Parameter   Estimate     Lower        Upper
β           1.9516       1.3745       2.6011
r           1.379        1.0138       2.0225
σ²_a        0.2691       2.07E-09     0.8657
σ²_b        1.352        0.5786       2.028
σ²_c        2.11E-09     2.06E-09     0.0303
σ²_d        4.72E-04     2.06E-09     0.0359
ICC         0.5207       0.4033       0.6091

a: random effect for dairy i, i = 1, 2, 3, 4.
b: random effect for pen j, where for i = 1, j = 1,...,8; for i = 2, j = 1,...,11; for i = 3, j = 1,...,7; and for i = 4, j = 1,...,4.
c: random effect for day k of sample collection, where k = 1, 2, 3.
d: random effect for collector l, where l = 1, 2 and day k; k = 1, 2, 3.

Figure 1 Performance of the Intraclass Correlation Coefficient (ICC) from a negative binomial mixed model with the number of
successes r = 5 and r = 30 (legend: true ICC, estimated ICC). The data simulated were for the example of culture results for a specific bacterium in pen floor samples (variance 0.5)
collected over 3 days 24 hours apart (variance 0.2) and simultaneously by 2 different veterinarians across 4 dairies (0 to 1 in increments of 0.2).

Discussion

The current study updates an earlier report on the reliability of environmental sampling to
quantify MAP in freestall dairy pens by using the negative binomial ICC for count data. A
unique character of the negative binomial ICC is the inclusion of the fixed effect intercept
estimate, unlike the ICC based on LMM, which is based solely on variance components. Fixed
effects are similarly included in the formula for the Poisson ICC; however, the negative
binomial ICC also includes r, the distribution parameter for the number of successes. A
performance comparison of the ICC estimates showed that the negative binomial ICC was more
suitable for count data that are overdispersed, given its smaller MSE and variance estimate
than the ICC from LMM. Relative bias tended to be least in more scenarios (7 out of 16) with
LMM than with the GLMM based ICC. The lower relative bias with LMM may be explained by the use
of REML estimation. ML estimation was chosen for the GLMM because REML for GLMM has not been
well defined, unlike for LMM (Jiang 2007). Nevertheless, the ICC for the negative binomial
data outperformed that based on LMM of logarithm or square root transformed data with respect
to MSE and variance. Results of a second simulation with highly overdispersed data showed that
the NB ICC tended to overestimate the true ICC with higher variance components and
underestimate it with lower variance components. This expected behavior was consistent with a
higher number of successes (r = 30), which confirms the stability of the estimator over a wide
variety of negative binomial distributed data.

Aly et al. (2009) estimated the ICC for similarity in HEYM culture results of MAP in samples
collected by two different collectors on the same day and from the same pen to be 67.3%. The
current study showed that the similarity in culture results estimated using the negative
binomial ICC could be as low as 52.07%. Such a difference is expected given that the culture
results are overdispersed count data. One reason for overdispersion may relate to the protocol
for culturing MAP on HEYM itself. Specifically, fecal slurry samples undergo a decontamination
step to limit bacterial growth on HEYM to mycobacteria. The decontamination step also reduces
the number of MAP organisms, resulting in samples with low MAP counts that may test negative
(zero colony forming units), which increases the variance. For this reason, quantitative
real-time PCR (qPCR) may remain the most suitable choice for testing freestall pen
environmental samples for MAP.